Extending, Trimming and Fusing WordNet for Technical Documents

Author

  • Piek Vossen
Abstract

This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, showing considerable improvement (7%-11%). The condensed hierarchies can be used as browse-interfaces to the documents complementary to retrieval.
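The gloss-based disambiguation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the salience measure (relative term frequency over the collection), the stopword list, the sense identifiers, and the function names `salience` and `pick_sense` are all assumptions made for illustration. The idea shown is only that a sense whose gloss contains words that are prominent in the document collection is preferred over a sense whose gloss does not.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "of", "on", "after", "that", "as", "or", "such"}

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def salience(documents):
    """Assumed salience measure: relative frequency of each word in the collection."""
    counts = Counter(w for doc in documents for w in tokenize(doc))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def pick_sense(glosses, sal):
    """Return the sense id whose gloss words carry the most total salience."""
    def score(gloss):
        return sum(sal.get(w, 0.0) for w in set(tokenize(gloss)))
    return max(glosses, key=lambda sense: score(glosses[sense]))

# Toy domain collection and two invented candidate senses for "driver".
docs = [
    "the printer driver installs on the network server",
    "restart the server after the driver update",
]
glosses = {
    "driver.person": "the operator of a motor vehicle",
    "driver.software": "a program that lets a computer control a device such as a printer or server",
}
best = pick_sense(glosses, salience(docs))
```

In this toy collection the software sense is selected because its gloss shares "printer" and "server" with the documents, while the gloss of the other sense shares no salient words. Senses that score poorly in this way would be candidates for trimming from the hierarchy.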

Related resources

FAQFinder with sense tagging FAQFinder without sense tagging

Figure 4: Recall vs. Rejection for FAQFinder with and without WordNet Sense Tagging search. In FAQFinder, sense tagging and calculation of semantic similarity are much more computationally intensive than term vector processing. However, since FAQFinder matches single questions rather than entire documents, the computa...

HesNegar: A Persian Sentiment WordNet

Awareness of others' opinions plays a crucial role in the decision making process performed by simple customers to top-level executives of manufacturing companies and various organizations. Today, with the advent of Web 2.0 and the expansion of social networks, a vast number of texts related to people's opinions have been created. However, exploring the enormous amount of documents, various opi...

Small Is Powerful! Towards a Refinedly Enriched Ontology by Careful Pruning and Trimming

In this paper, we study how to better merge a WordNet-like ontology with an online encyclopedia. We first eliminate noise with heuristic rules, and then adopt a domain-dependent strategy to trim the encyclopedia structure. Finally, we integrate entities from the trimmed structure into the original ontology to construct a refinedly enriched ontology. The experimental results show that...

Extending a wordnet framework for simplicity and scalability

The WordNet knowledge model is currently implemented in multiple software frameworks that provide procedural access to language instances of it. Frameworks tend to focus on structural and design aspects of the model, and thus describe low-level interfaces for linguistic knowledge retrieval. Typically, the only high-level feature directly accessible is word lookup, while traversal of semantic relations...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Publication date: 2001